This article proposes a new acoustic model that uses decision trees (DTs) in place of Gaussian mixture models (GMMs) to compute the observation likelihoods for a given hidden Markov model state in a speech recognition system. DTs have several advantageous properties: they impose no restrictions on the number or types of features, and they perform feature selection automatically. This article explores and exploits DTs for large vocabulary speech recognition. Equal and decoding questions are newly introduced into DTs to directly model the gender- and context-dependent acoustic space. Experimental results on the 5k ARPA Wall Street Journal task show that, as expected, context information significantly improves the performance of DT-based acoustic models. Context-dependent DT-based models are also highly compact compared to conventional GMM-based acoustic models. This means that the proposed models share data effectively across the various context classes.
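As a rough illustration of the idea, the following Python sketch shows how a tree attached to one HMM state might map an input frame to a log-likelihood. It is not the article's implementation: the class names, the feature names `gender` and `c1`, the leaf values, and the reading of an "equal question" as an equality test on a discrete attribute are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Dict, Union

Value = Union[float, str]


@dataclass
class Leaf:
    # Log-likelihood returned for frames that reach this leaf (made-up values below).
    log_likelihood: float


@dataclass
class Node:
    feature: str   # name of the feature the question inspects
    kind: str      # "threshold": is x <= value?  "equal": is x == value?
    value: Value
    yes: Union["Node", Leaf]
    no: Union["Node", Leaf]


def answer(node: Node, frame: Dict[str, Value]) -> bool:
    """Evaluate the question stored at one internal node."""
    x = frame[node.feature]
    if node.kind == "threshold":
        return x <= node.value
    if node.kind == "equal":
        return x == node.value
    raise ValueError(f"unknown question kind: {node.kind}")


def log_likelihood(tree: Union[Node, Leaf], frame: Dict[str, Value]) -> float:
    """Walk from the root to a leaf and return the leaf's log-likelihood."""
    node = tree
    while isinstance(node, Node):
        node = node.yes if answer(node, frame) else node.no
    return node.log_likelihood


# Hypothetical tree for a single HMM state: an equality question on a
# discrete attribute at the root, then a threshold question on an
# acoustic feature.
tree = Node(
    feature="gender", kind="equal", value="female",
    yes=Node(feature="c1", kind="threshold", value=0.5,
             yes=Leaf(-1.0), no=Leaf(-3.0)),
    no=Leaf(-2.0),
)

score = log_likelihood(tree, {"gender": "female", "c1": 0.2})
```

Because discrete attributes and continuous acoustic features are handled by the same traversal, a single tree can cover several speaker or context classes, which is one plausible way to read the data-sharing claim above.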